In this article, a general problem of sequential statistical inference for general discrete-time 
stochastic processes is considered. The problem is to minimize an average sample number 
given that Bayesian risk due to incorrect decision does not exceed some given bound. We 
characterize the form of optimal sequential stopping rules in this problem. In particular, 
we have a characterization of the form of optimal sequential decision procedures when the 
Bayesian risk includes both the loss due to incorrect decision and the cost of observations. 

1 INTRODUCTION 

Let $X_1, X_2, \dots, X_n, \dots$ be a discrete-time stochastic process whose distribution depends on an unknown parameter $\theta$, $\theta \in \Theta$. In this article, we consider a general problem of sequential statistical decision making based on the observations of this process.

Let us suppose that for any $n = 1, 2, \dots$, the vector $(X_1, X_2, \dots, X_n)$ has a probability "density" function

$$f_\theta^n = f_\theta^n(x_1, x_2, \dots, x_n) \tag{1}$$

(the Radon-Nikodym derivative of its distribution) with respect to a product measure

$$\mu^n = \underbrace{\mu \otimes \mu \otimes \dots \otimes \mu}_{n},$$

with some $\sigma$-finite measure $\mu$ on the respective space. As usual in the Bayesian context, we suppose that $f_\theta^n(x_1, x_2, \dots, x_n)$ is measurable with respect to $(\theta, x_1, \dots, x_n)$, for any $n = 1, 2, \dots$.

Let us define a sequential statistical procedure as a pair $(\psi, \delta)$, $\psi$ being a (randomized) stopping rule,

$$\psi = (\psi_1, \psi_2, \dots, \psi_n, \dots),$$

and $\delta$ a decision rule,

$$\delta = (\delta_1, \delta_2, \dots, \delta_n, \dots),$$

supposing that

$$\psi_n = \psi_n(x_1, x_2, \dots, x_n)$$

and

$$\delta_n = \delta_n(x_1, x_2, \dots, x_n)$$

are measurable functions, $\psi_n(x_1, \dots, x_n) \in [0, 1]$, $\delta_n(x_1, \dots, x_n) \in \mathcal{D}$ (a decision space), for any observation vector $(x_1, \dots, x_n)$, for any $n = 1, 2, \dots$ (see, for example,

m, m, m, m, m)- 

The interpretation of these elements is as follows.

The value of $\psi_n(x_1, \dots, x_n)$ is interpreted as the conditional probability to stop and proceed to decision making, given that we came to stage $n$ of the experiment and that the observations up to stage $n$ were $(x_1, x_2, \dots, x_n)$. If there is no stop, the experiment continues to the next stage and an additional observation $x_{n+1}$ is taken. Then the rule $\psi_{n+1}$ is applied to $x_1, \dots, x_n, x_{n+1}$ in the same way as above, etc., until the experiment eventually stops.

When the experiment stops at stage $n$, $(x_1, \dots, x_n)$ being the observed data vector, the decision specified by $\delta_n(x_1, \dots, x_n)$ is taken, and the sequential statistical experiment stops.

The stopping rule $\psi$ generates, by the above process, a random variable $\tau_\psi$ (randomized stopping time), which may be defined as follows. Let $U_1, U_2, \dots, U_n, \dots$ be a sequence of independent and identically distributed (i.i.d.) random variables uniformly distributed on $[0, 1]$ (randomization variables), such that the process $(U_1, U_2, \dots)$ is independent of the process of observations $(X_1, X_2, \dots)$. Then let us say that $\tau_\psi = n$ if, and only if,

$$U_1 > \psi_1(X_1), \; \dots, \; U_{n-1} > \psi_{n-1}(X_1, \dots, X_{n-1}), \text{ and } U_n \le \psi_n(X_1, \dots, X_n),$$

$n = 1, 2, \dots$.

It is easy to see that the distribution of $\tau_\psi$ is given by

$$P_\theta(\tau_\psi = n) = E_\theta (1 - \psi_1)(1 - \psi_2) \cdots (1 - \psi_{n-1}) \psi_n, \quad n = 1, 2, \dots. \tag{2}$$

In (2), $\psi_n$ stands for $\psi_n(X_1, \dots, X_n)$, unlike its previous definition as $\psi_n = \psi_n(x_1, \dots, x_n)$. We use this "duality" throughout the paper, applying, for any $F_n = F_n(x_1, \dots, x_n)$ or $F_n = F_n(X_1, \dots, X_n)$, the following general rule: when $F_n$ is under the probability or expectation sign, it is $F_n(X_1, \dots, X_n)$; otherwise it is $F_n(x_1, \dots, x_n)$.
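As an illustration of formula (2), the following minimal Python sketch assumes i.i.d. Bernoulli observations and a hypothetical stopping rule `example_rule`; neither is taken from the article. It computes $P_\theta(\tau_\psi = n)$ by enumerating all samples, as in (2), and also simulates the randomized stopping time directly from its definition through the uniform randomization variables.

```python
import itertools
import random


def example_rule(n, xs):
    """Hypothetical rule psi_n(x_1..x_n): always stop at stage 3 (truncation)."""
    if n >= 3:
        return 1.0
    return 0.5 if xs[-1] == 1 else 0.2


def stopping_time_pmf(psi, theta, n_max):
    """P_theta(tau = n) for n = 1..n_max, computed from formula (2)."""
    pmf = []
    for n in range(1, n_max + 1):
        total = 0.0
        for xs in itertools.product((0, 1), repeat=n):
            # Bernoulli(theta) density of the sample (assumed model)
            density = 1.0
            for x in xs:
                density *= theta if x == 1 else 1.0 - theta
            # (1 - psi_1) ... (1 - psi_{n-1}) evaluated on the sample
            keep_going = 1.0
            for i in range(1, n):
                keep_going *= 1.0 - psi(i, xs[:i])
            total += density * keep_going * psi(n, xs)
        pmf.append(total)
    return pmf


def simulate_stopping_time(psi, theta, rng, max_steps=10_000):
    """Draw tau: stop at the first n with U_n <= psi_n(X_1..X_n)."""
    xs = []
    for _ in range(max_steps):
        xs.append(1 if rng.random() < theta else 0)
        if rng.random() <= psi(len(xs), tuple(xs)):
            return len(xs)
    return None  # no stop within max_steps


pmf = stopping_time_pmf(example_rule, theta=0.4, n_max=3)
```

Because the example rule is truncated at stage 3, the three probabilities in `pmf` add up to one; empirical frequencies from `simulate_stopping_time` can be compared with them.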

Let $w(\theta, d)$ be a non-negative loss function (measurable with respect to $(\theta, d)$, $\theta \in \Theta$, $d \in \mathcal{D}$) and $\pi_1$ any probability measure on $\Theta$. We define the average loss of the sequential statistical procedure $(\psi, \delta)$ as

$$W(\psi, \delta) = \sum_{n=1}^{\infty} \int \left[ E_\theta (1 - \psi_1) \cdots (1 - \psi_{n-1}) \psi_n w(\theta, \delta_n) \right] d\pi_1(\theta), \tag{3}$$

and its average sample number, given $\theta$, as

$$N(\theta; \psi) = E_\theta \tau_\psi \tag{4}$$

(we suppose that $N(\theta; \psi) = \infty$ if $\sum_{n=1}^{\infty} P_\theta(\tau_\psi = n) < 1$ in (2)). Let us also define its "weighted" value

$$N(\psi) = \int N(\theta; \psi) \, d\pi_2(\theta), \tag{5}$$

where $\pi_2$ is some probability measure on $\Theta$, giving "weights" to the particular values of $\theta$.

Our main goal is minimizing $N(\psi)$ over all sequential decision procedures $(\psi, \delta)$ subject to

$$W(\psi, \delta) \le w, \tag{6}$$

where $w$ is some positive constant, supposing that $\pi_1$ in (3) and $\pi_2$ in (5) are, generally speaking, two different probability measures. We only consider the cases when there exist procedures satisfying (6).

Sometimes it is necessary to put the risk under control in a more detailed way. Let $\Theta_1, \dots, \Theta_k$ be some subsets of the parametric space such that $\Theta_i \cap \Theta_j = \emptyset$ if $i \ne j$, $i, j = 1, \dots, k$. Then, instead of (6), we may want to guarantee that

$$W_i(\psi, \delta) = \sum_{n=1}^{\infty} \int_{\Theta_i} E_\theta (1 - \psi_1) \cdots (1 - \psi_{n-1}) \psi_n w(\theta, \delta_n) \, d\pi_1(\theta) \le w_i, \tag{7}$$

with some $w_i > 0$, for any $i = 1, \dots, k$, when minimizing $N(\psi)$.

To advocate restricting the sequential procedures by (7), let us see a particular case of hypothesis testing.

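Let $H_1 : \theta = \theta_1$ and $H_2 : \theta = \theta_2$ be two simple hypotheses about the parameter value, and let

$$w(\theta, d) = \begin{cases} 1 & \text{if } \theta = \theta_1 \text{ and } d = 2, \\ 1 & \text{if } \theta = \theta_2 \text{ and } d = 1, \\ 0 & \text{otherwise,} \end{cases}$$

and $\pi_1(\{\theta_1\}) = \pi$, $\pi_1(\{\theta_2\}) = 1 - \pi$, with some $0 < \pi < 1$. Then, letting $\Theta_i = \{\theta_i\}$, $i = 1, 2$, in (7), we have that

$$W_1(\psi, \delta) = \pi P_{\theta_1}(\text{reject } H_1) = \pi \alpha(\psi, \delta)$$

and

$$W_2(\psi, \delta) = (1 - \pi) P_{\theta_2}(\text{accept } H_1) = (1 - \pi) \beta(\psi, \delta),$$

where $\alpha(\psi, \delta)$ and $\beta(\psi, \delta)$ are the type I and type II error probabilities. Thus, taking in (7) $w_1 = \pi \alpha$, $w_2 = (1 - \pi) \beta$, with some $\alpha, \beta \in (0, 1)$, we see that (7) is equivalent to

$$\alpha(\psi, \delta) \le \alpha \quad \text{and} \quad \beta(\psi, \delta) \le \beta. \tag{8}$$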

Let now $\pi_2(\{\theta_1\}) = 1$ and suppose that the observations are i.i.d. Then our problem of minimizing $N(\psi) = N(\theta_1; \psi)$ under restrictions (8) is the classical Wald and 
Wolfowitz problem of minimizing the expected sample size (see [IS]). It is well 
known that its solution is given by the sequential probability ratio test (SPRT), and 
that it minimizes the expected sample size under the alternative hypothesis as well 



On the other hand, if $\pi_2(\{\theta\}) = 1$ with $\theta \ne \theta_1$ and $\theta \ne \theta_2$, we have the problem known as the modified Kiefer-Weiss problem, the problem of minimizing the expected sample size, under $\theta$, among all sequential tests subject to (8) (see [21], [10]). The 
general structure of the optimal sequential test in this problem is given by Lorden 
[T^ for i.i.d. observations. 

So, we see that considering natural particular cases of sequential procedures subject to (7) and using different choices of $\pi_1$ in (3) and $\pi_2$ in (5) we extend known problems for i.i.d. observations to the case of general discrete-time stochastic processes.

The method we use in this article was originally developed for testing of two hypotheses [17], then extended for multiple hypothesis testing problems [15]. An extension of the same method for hypothesis testing problems when control variables are present can be found in [14].

A setting for Bayes-type decision problems more general than the one used in this article, where both the cost of observations and the loss functions depend on the true value of the parameter and on the observations, is considered in [16].

From now on, our aim will be minimizing $N(\psi)$, defined by (5), in the class of sequential statistical procedures subject to (7).

In Section 2, we reduce the problem to an optimal stopping problem. In Section 3, we give a solution to the optimal stopping problem in the class of truncated stopping rules, and in Section 4 in some natural class of non-truncated stopping rules. In particular, in Section 4 we give a solution to the problem of minimizing $N(\psi)$ in the class of all statistical procedures satisfying $W_i(\psi, \delta) \le w_i$, $i = 1, \dots, k$ (see Remark 8).

2 REDUCTION TO AN OPTIMAL STOPPING PROBLEM 

In this section, the problem of minimizing the average sample number (5) over all sequential procedures subject to (7) will be reduced to an optimal stopping problem. This is a usual treatment of conditional problems in sequential hypothesis testing 
(see, for example, [2], [T^], [3], [I3J)- We will use the same ideas to treat the general 
statistical decision problem described above. 

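Let us define the following Lagrange-multiplier function:

$$L(\psi, \delta) = L(\psi, \delta; \lambda_1, \dots, \lambda_k) = N(\psi) + \sum_{i=1}^{k} \lambda_i W_i(\psi, \delta), \tag{9}$$

where $\lambda_i \ge 0$, $i = 1, \dots, k$, are some constant multipliers.
Let $\Delta$ be a class of sequential statistical procedures.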

The following Theorem is a direct application of the method of Lagrange multi- 
pliers to the above optimization problem. 



(see HO], [H]). 



Theorem 1. Let there exist $\lambda_i > 0$, $i = 1, \dots, k$, and a procedure $(\psi^*, \delta^*) \in \Delta$ such that for any procedure $(\psi, \delta) \in \Delta$

$$L(\psi^*, \delta^*; \lambda_1, \dots, \lambda_k) \le L(\psi, \delta; \lambda_1, \dots, \lambda_k) \tag{10}$$

holds and such that

$$W_i(\psi^*, \delta^*) = w_i, \quad i = 1, \dots, k. \tag{11}$$

Then for any procedure $(\psi, \delta) \in \Delta$ satisfying

$$W_i(\psi, \delta) \le w_i, \quad i = 1, 2, \dots, k, \tag{12}$$

it holds

$$N(\psi^*) \le N(\psi). \tag{13}$$

The inequality in (13) is strict if at least one of the inequalities (12) is strict.
Proof. Let $(\psi, \delta) \in \Delta$ be any procedure satisfying (12). Because of (10),

$$L(\psi^*, \delta^*; \lambda_1, \dots, \lambda_k) = N(\psi^*) + \sum_{i=1}^{k} \lambda_i W_i(\psi^*, \delta^*) \le L(\psi, \delta; \lambda_1, \dots, \lambda_k) \tag{14}$$

$$= N(\psi) + \sum_{i=1}^{k} \lambda_i W_i(\psi, \delta) \le N(\psi) + \sum_{i=1}^{k} \lambda_i w_i, \tag{15}$$

where to get the last inequality we used (12). Taking into account conditions (11) we get from this that

$$N(\psi^*) \le N(\psi).$$

To get the last statement of the theorem we note that if $N(\psi^*) = N(\psi)$ then there are equalities in (14) - (15) instead of the inequalities, which is only possible if $W_i(\psi, \delta) = w_i$ for any $i = 1, \dots, k$. □

Remark 1. It is easy to see that defining a new loss function $w'(\theta, d)$ which is equal to $\lambda_i w(\theta, d)$ whenever $\theta \in \Theta_i$, $i = 1, \dots, k$, we have that the weighted average loss $W(\psi, \delta)$ defined by (3) with $w(\theta, d) = w'(\theta, d)$ coincides with the second summand in (9).

Because of this, we treat in what follows only the case of one summand ($k = 1$) in (9), the Lagrange-multiplier function being defined as

$$L(\psi, \delta; \lambda) = N(\psi) + \lambda W(\psi, \delta). \tag{16}$$

It is obvious that the problem of minimization of (16) is equivalent to that of minimization of

$$R(\psi, \delta; c) = c N(\psi) + W(\psi, \delta), \tag{17}$$

where $c > 0$ is any constant, and, in the rest of the article, we will solve the problem of minimizing (17), instead of (16). This is because the problem of minimization of (17) is interesting by itself, without its relation to the conditional problem above. For example, if $\pi_2 = \pi_1 = \pi$, it is easy to see that it is equivalent to the problem of Bayesian sequential decision-making, with the prior distribution $\pi$ and a fixed cost $c$ per observation. The latter set-up is fundamental in sequential analysis (see 
[IS], 0, [7], [22], 0, among many others). 
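
Returning to the equivalence between minimizing (16) and minimizing (17), the following short derivation may help. It assumes $\lambda > 0$ and makes the explicit choice $c = 1/\lambda$; this choice is our illustration and is not stated in the text above.

```latex
\begin{align*}
L(\psi,\delta;\lambda)
  &= N(\psi) + \lambda W(\psi,\delta) \\
  &= \lambda \left( \tfrac{1}{\lambda} N(\psi) + W(\psi,\delta) \right) \\
  &= \lambda \, R(\psi,\delta; c), \qquad c = 1/\lambda .
\end{align*}
```

Since $\lambda$ is a positive constant factor, a procedure minimizes $L(\cdot\,; \lambda)$ over a given class exactly when it minimizes $R(\cdot\,; 1/\lambda)$ over the same class.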
Because of Theorem 1, from now on our main focus will be on the unrestricted minimization of $R(\psi, \delta; c)$ over all sequential decision procedures.

Let us suppose, additionally to the assumptions of the Introduction, that for any $n = 1, 2, \dots$ there exists a decision function $\delta_n^B = \delta_n^B(x_1, \dots, x_n)$ such that for any $d \in \mathcal{D}$

$$\int w(\theta, d) f_\theta^n(x_1, \dots, x_n) \, d\pi_1(\theta) \ge \int w(\theta, \delta_n^B(x_1, \dots, x_n)) f_\theta^n(x_1, \dots, x_n) \, d\pi_1(\theta) \tag{18}$$

for $\mu^n$-almost all $(x_1, \dots, x_n)$. Then $\delta_n^B$ is called the Bayesian decision function based on $n$ observations. We do not discuss in this article the questions of the existence of Bayesian decision functions; we just suppose that they exist for any $n = 1, 2, \dots$, referring, e.g., to [12] for an extensive underlying theory.

Let us denote by $l_n = l_n(x_1, \dots, x_n)$ the right-hand side of (18). It easily follows from (18) that

$$\int l_n \, d\mu^n = \inf_{\delta_n} \int E_\theta w(\theta, \delta_n) \, d\pi_1(\theta), \tag{19}$$

thus

$$\int l_1 \, d\mu^1 \ge \int l_2 \, d\mu^2 \ge \dots.$$

Because of that, we suppose that

$$\int l_1(x) \, d\mu(x) < \infty,$$

which makes all the Bayesian risks (19) finite, for any $n = 1, 2, \dots$.

Let $\delta^B = (\delta_1^B, \delta_2^B, \dots)$. The following theorem shows that the only decision rules worth our attention are the Bayesian ones. Its "if"-part is, in essence, Theorem 5.2.1 

m- 

Let for any $n = 1, 2, \dots$ and for any stopping rule $\psi$

$$s_n^\psi = (1 - \psi_1) \cdots (1 - \psi_{n-1}) \psi_n,$$

and let

$$S_n^\psi = \{(x_1, \dots, x_n) : s_n^\psi(x_1, \dots, x_n) > 0\}$$

for all $n = 1, 2, \dots$.

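Theorem 2. For any sequential procedure $(\psi, \delta)$

$$W(\psi, \delta) \ge W(\psi, \delta^B) = \sum_{n=1}^{\infty} \int s_n^\psi l_n \, d\mu^n. \tag{20}$$

Supposing that the right-hand side of (20) is finite, the equality in (20) is only possible if

$$\int w(\theta, \delta_n) f_\theta^n \, d\pi_1(\theta) = \int w(\theta, \delta_n^B) f_\theta^n \, d\pi_1(\theta)$$

$\mu^n$-almost everywhere on $S_n^\psi$ for all $n = 1, 2, \dots$.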
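Proof. It is easy to see that $W(\psi, \delta)$ on the left-hand side of (20) has the following equivalent form:

$$W(\psi, \delta) = \sum_{n=1}^{\infty} \int s_n^\psi \int w(\theta, \delta_n) f_\theta^n \, d\pi_1(\theta) \, d\mu^n. \tag{21}$$

Applying (18) under the integral sign in each summand in (21) we immediately have:

$$W(\psi, \delta) \ge \sum_{n=1}^{\infty} \int s_n^\psi \int w(\theta, \delta_n^B) f_\theta^n \, d\pi_1(\theta) \, d\mu^n = W(\psi, \delta^B). \tag{22}$$

If $W(\psi, \delta^B) < \infty$, then (22) is equivalent to

$$\sum_{n=1}^{\infty} \int s_n^\psi \Delta_n \, d\mu^n \ge 0,$$

where

$$\Delta_n = \int w(\theta, \delta_n) f_\theta^n \, d\pi_1(\theta) - \int w(\theta, \delta_n^B) f_\theta^n \, d\pi_1(\theta),$$

which is, due to (18), non-negative $\mu^n$-almost everywhere for all $n = 1, 2, \dots$. Thus, there is an equality in (22) if and only if $\Delta_n = 0$ $\mu^n$-almost everywhere on $S_n^\psi = \{s_n^\psi > 0\}$ for all $n = 1, 2, \dots$. □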

Because of (17), it follows from Theorem 2 that for any sequential decision procedure $(\psi, \delta)$

$$R(\psi, \delta; c) \ge R(\psi, \delta^B; c). \tag{23}$$

The following lemma gives the right-hand side of (23) a more convenient form. For any probability measure $\pi$ on $\Theta$ let us denote

$$P^\pi(\tau_\psi = n) = \int P_\theta(\tau_\psi = n) \, d\pi(\theta) = \int E_\theta s_n^\psi \, d\pi(\theta),$$

for $n = 1, 2, \dots$. Respectively, $P^\pi(\tau_\psi < \infty) = \sum_{n=1}^{\infty} P^\pi(\tau_\psi = n)$, and

$$E^\pi \tau_\psi = \int E_\theta \tau_\psi \, d\pi(\theta).$$

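Lemma 1. If

$$P^{\pi_2}(\tau_\psi < \infty) = 1 \tag{24}$$

then

$$R(\psi, \delta^B; c) = \sum_{n=1}^{\infty} \int s_n^\psi \left( c n f^n + l_n \right) d\mu^n, \tag{25}$$

where, by definition,

$$f^n = f^n(x_1, \dots, x_n) = \int f_\theta^n(x_1, \dots, x_n) \, d\pi_2(\theta). \tag{26}$$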
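Proof. By Theorem 2,

$$R(\psi, \delta^B; c) = c N(\psi) + W(\psi, \delta^B) = c N(\psi) + \sum_{n=1}^{\infty} \int s_n^\psi l_n \, d\mu^n. \tag{27}$$

If now (24) is fulfilled, then, by the Fubini theorem,

$$N(\psi) = \int \sum_{n=1}^{\infty} n E_\theta s_n^\psi \, d\pi_2(\theta) = \sum_{n=1}^{\infty} \int n \left[ \int s_n^\psi f_\theta^n \, d\mu^n \right] d\pi_2(\theta) = \sum_{n=1}^{\infty} \int n s_n^\psi \left[ \int f_\theta^n \, d\pi_2(\theta) \right] d\mu^n = \sum_{n=1}^{\infty} \int n s_n^\psi f^n \, d\mu^n,$$

so, combining this with (27), we get (25). □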
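Let us denote

$$R(\psi) = R(\psi; c) = R(\psi, \delta^B; c). \tag{28}$$

By Lemma 1,

$$R(\psi) = \begin{cases} \displaystyle \sum_{n=1}^{\infty} \int s_n^\psi \left( c n f^n + l_n \right) d\mu^n, & \text{if } P^{\pi_2}(\tau_\psi < \infty) = 1, \\ \infty, & \text{otherwise.} \end{cases} \tag{29}$$

The aim of what follows is to minimize $R(\psi)$ over all stopping rules. In this way, our problem of minimization of $R(\psi, \delta)$ is reduced to an optimal stopping problem.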

3 OPTIMAL TRUNCATED STOPPING RULES 

In this section, as a first step, we characterize the structure of optimal stopping rules in the class $\mathcal{F}^N$, $N \ge 2$, of all truncated stopping rules, i.e., such that

$$\psi = (\psi_1, \psi_2, \dots, \psi_{N-1}, 1, \dots) \tag{30}$$

(if $(1 - \psi_1) \cdots (1 - \psi_n) = 0$ $\mu^n$-almost everywhere for some $n < N$, we suppose that $\psi_k \equiv 1$ for any $k > n$, so $\mathcal{F}^N \subset \mathcal{F}^{N+1}$, $N = 1, 2, \dots$).

Obviously, for any $\psi \in \mathcal{F}^N$

$$R(\psi) = R_N(\psi) = \sum_{n=1}^{N-1} \int s_n^\psi \left( c n f^n + l_n \right) d\mu^n + \int t_N^\psi \left( c N f^N + l_N \right) d\mu^N,$$

where for any $n = 1, 2, \dots$

$$t_n^\psi = t_n^\psi(x_1, \dots, x_n) = (1 - \psi_1(x_1))(1 - \psi_2(x_1, x_2)) \cdots (1 - \psi_{n-1}(x_1, \dots, x_{n-1}))$$

(we suppose, by definition, that $t_1^\psi \equiv 1$).
Let us introduce a sequence of functions $V_n^N$, $n = 1, \dots, N$, which will define optimal stopping rules. Let $V_N^N \equiv l_N$, and recursively for $n = N - 1, N - 2, \dots, 1$

$$V_n^N = \min\{l_n, Q_n^N\}, \tag{31}$$

where

$$Q_n^N = Q_n^N(x_1, \dots, x_n) = c f^n(x_1, \dots, x_n) + \int V_{n+1}^N(x_1, \dots, x_{n+1}) \, d\mu(x_{n+1}), \tag{32}$$

$n = 0, 1, \dots, N - 1$ (we assume that $f^0 \equiv 1$). Note that all $V_n^N$ and $Q_n^N$ implicitly depend on the "unitary observation cost" $c$.
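
The recursion (31)-(32) can be run numerically in simple models. The sketch below is only an illustration under assumptions that are not part of the article: i.i.d. Bernoulli observations, two simple hypotheses `theta1` and `theta2` with 0-1 loss, $\mu$ the counting measure on $\{0, 1\}$, and hypothetical parameter names `prior1` for $\pi_1(\{\theta_1\})$ and `weight1` for $\pi_2(\{\theta_1\})$. In this model $l_n$ and $f^n$ depend on the sample only through the number of ones $k$, so each $V_n^N$ is stored as a list indexed by $k$.

```python
def truncated_values(N, c, theta1, theta2, prior1, weight1):
    """Backward induction (31)-(32) for Bernoulli data and 0-1 loss.

    Returns (Q0, V), where Q0 is Q_0^N and V[n][k] is V_n^N evaluated at
    any sample of size n containing k ones.
    """
    def dens(theta, n, k):
        # density f_theta^n of a sample with k ones among n observations
        return theta**k * (1 - theta)**(n - k)

    def l(n, k):
        # l_n: the smaller of the two prior-weighted error terms
        return min(prior1 * dens(theta1, n, k),
                   (1 - prior1) * dens(theta2, n, k))

    def f(n, k):
        # f^n: mixture of the densities with respect to pi_2
        return weight1 * dens(theta1, n, k) + (1 - weight1) * dens(theta2, n, k)

    V = {N: [l(N, k) for k in range(N + 1)]}       # V_N^N = l_N
    for n in range(N - 1, 0, -1):
        V[n] = []
        for k in range(n + 1):
            # integral over x_{n+1} in {0, 1}: next sample has k or k + 1 ones
            Q = c * f(n, k) + V[n + 1][k] + V[n + 1][k + 1]
            V[n].append(min(l(n, k), Q))           # (31)
    Q0 = c + V[1][0] + V[1][1]                     # (32) with n = 0, f^0 = 1
    return Q0, V


# Q_0^N for several truncation levels N (illustrative parameter values)
lower_bounds = [truncated_values(N, 0.01, 0.3, 0.7, 0.5, 0.5)[0]
                for N in range(1, 11)]
```

In this sketch the list `lower_bounds` holds the values $Q_0^N$ for $N = 1, \dots, 10$; they can be compared with the no-observation risk $\min\{\pi, 1 - \pi\}$ of the same toy model.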

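The following theorem characterizes the structure of optimal stopping rules in $\mathcal{F}^N$.

Theorem 3. For all $\psi \in \mathcal{F}^N$

$$R_N(\psi) \ge Q_0^N. \tag{33}$$

The lower bound in (33) is attained by a $\psi \in \mathcal{F}^N$ if and only if

$$I_{\{l_n < Q_n^N\}} \le \psi_n \le I_{\{l_n \le Q_n^N\}} \tag{34}$$

$\mu^n$-almost everywhere on

$$T_n^\psi = \{(x_1, \dots, x_n) : t_n^\psi(x_1, \dots, x_n) > 0\},$$

for all $n = 1, 2, \dots, N - 1$.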

The proof of Theorem 3 can be conducted following the lines of the proof of Theorem 3.1 in [17] (in a less formal way, the same routine is used to obtain Theorem 4 in [H]). In fact, both of these theorems are particular cases of Theorem 3.

Remark 2. Although $\psi$ satisfying (34) is optimal among all truncated stopping rules in $\mathcal{F}^N$, it only makes practical sense if

$$l_0 = \inf_d \int w(\theta, d) \, d\pi_1(\theta) \ge Q_0^N. \tag{35}$$

Indeed, if (35) does not hold, we can, without taking any observation, make any decision $d_0$ such that $\int w(\theta, d_0) \, d\pi_1(\theta) < Q_0^N$, and this guarantees that this trivial procedure (something like "$(\psi_0, d_0)$" with $R(\psi_0, d_0) = \int w(\theta, d_0) \, d\pi_1(\theta) < Q_0^N$) performs better than the best procedure with the optimal stopping time in $\mathcal{F}^N$.

Because of this, $V_0^N$, defined by (31) for $n = 0$, may be considered the "minimum value of $R(\psi)$" when taking no observations is allowed.
Remark 3. When $\pi_2$ in (5) coincides with $\pi_1$ in (3) (Bayesian setting), an optimal truncated (non-randomized) stopping rule for minimizing (17) is provided by Theorem 5.2.2 in [S]. Theorem 3 describes the class of all randomized optimal stopping rules for the same problem in this particular case. This may be irrelevant if one is interested in the purely Bayesian problem, because any of these stopping rules provides the same minimum value of the risk.

Nevertheless, this extension of the class of optimal procedures may be useful for complying with (11) in Theorem 1 when seeking optimal sequential procedures for the original conditional problem (minimization of $N(\psi)$ given that $W_i(\psi, \delta) \le w_i$, $i = 1, \dots, k$; see the Introduction and the discussion therein). This is very much like in non-sequential hypothesis testing, where the randomization is crucial for finding the optimal level-$\alpha$ test in the Neyman-Pearson problem (see, for example, jIT]).



4 OPTIMAL NON-TRUNCATED STOPPING RULES 

In this section, we solve the problem of minimization of $R(\psi)$ in natural classes of non-truncated stopping rules $\psi$.

Let $\psi$ be any stopping rule. Define

$$R_N(\psi) = R_N(\psi; c) = \sum_{n=1}^{N-1} \int s_n^\psi \left( c n f^n + l_n \right) d\mu^n + \int t_N^\psi \left( c N f^N + l_N \right) d\mu^N. \tag{36}$$

This is the "risk" (29) for $\psi$ truncated at $N$, i.e. the rule with the components $\psi^N = (\psi_1, \psi_2, \dots, \psi_{N-1}, 1, \dots)$: $R_N(\psi) = R(\psi^N)$.

Because $\psi^N$ is truncated, the results of the preceding section apply, in particular, the lower bound of (33). Very much like in [17] and in [15], our aim is to pass to the limit, as $N \to \infty$, in order to obtain a lower bound for $R(\psi)$, and conditions for attaining this bound.

It is easy to see that $V_n^N(x_1, \dots, x_n) \ge V_n^{N+1}(x_1, \dots, x_n)$ for all $N \ge n$, and for all $(x_1, \dots, x_n)$, $n \ge 1$ (see, for example, Lemma 3.3 in [17]). Thus, for any $n \ge 1$ there exists

$$V_n = V_n(x_1, \dots, x_n) = \lim_{N \to \infty} V_n^N(x_1, \dots, x_n)$$

($V_n$ implicitly depend on $c$, as $V_n^N$ do). It immediately follows from the monotone convergence theorem that for all $n \ge 1$

$$\lim_{N \to \infty} Q_n^N(x_1, \dots, x_n) = c f^n(x_1, \dots, x_n) + \int V_{n+1}(x_1, \dots, x_{n+1}) \, d\mu(x_{n+1}) \tag{37}$$

(see (32)). Let $Q_n = Q_n(x_1, \dots, x_n) = \lim_{N \to \infty} Q_n^N(x_1, \dots, x_n)$.

In addition, passing to the limit, as $N \to \infty$, in (31) we obtain

$$V_n = \min\{l_n, Q_n\}, \quad n = 1, 2, \dots.$$

Let now $\mathcal{F}$ be any class of stopping rules such that $\psi \in \mathcal{F}$ entails $R_N(\psi) \to R(\psi)$, as $N \to \infty$. It is easy to see that such classes exist; for example, any $\mathcal{F}^N$ has this property. Moreover, we will assume that all truncated stopping rules are included in $\mathcal{F}$, i.e. that $\bigcup_{N \ge 1} \mathcal{F}^N \subset \mathcal{F}$.

It follows from Theorem 3 now that for all $\psi \in \mathcal{F}$

$$R(\psi) \ge Q_0. \tag{38}$$

The following lemma states that, in fact, the lower bound in (38) is the infimum of the risk $R(\psi)$ over $\psi \in \mathcal{F}$.

Lemma 2.

$$Q_0 = \inf_{\psi \in \mathcal{F}} R(\psi).$$

The proof of Lemma 2 is very close to that of Lemma 3.5 in [17] (see also Lemma 
6 in [T3]) and is omitted here. 

Remark 4. Again (see Remark 3), if $\pi_1 = \pi_2$, Lemma 2 is essentially Theorem 
5.2.3 in (see also Section 7.2 of [8 ) . 

The following theorem gives the structure of optimal stopping rules in $\mathcal{F}$.

Theorem 4. If there exists $\psi \in \mathcal{F}$ such that

$$R(\psi) = \inf_{\psi' \in \mathcal{F}} R(\psi'), \tag{39}$$

then

$$I_{\{l_n < Q_n\}} \le \psi_n \le I_{\{l_n \le Q_n\}} \tag{40}$$

$\mu^n$-almost everywhere on $T_n^\psi$ for all $n = 1, 2, \dots$.

On the other hand, if a stopping rule $\psi$ satisfies (40) $\mu^n$-almost everywhere on $T_n^\psi$ for all $n = 1, 2, \dots$, and $\psi \in \mathcal{F}$, then $\psi$ satisfies (39) as well.

The proof of Theorem 4 is very close to the proof of Theorem 3.2 in [17] or Theorem 6 in [15] and is omitted here.

It follows from Theorem 4 that "$\psi \in \mathcal{F}$" is a sufficient condition for the optimality of a stopping rule $\psi$ satisfying (40). In the hypothesis testing problems considered in [17] and in [15], there are large classes of problems (called truncatable) for which $R_N(\psi) \to R(\psi)$, as $N \to \infty$, for all stopping times $\psi$. In this article, we also identify the problems where this is the case.

Let us say that a stopping rule $\psi$ is truncatable if $R_N(\psi) \to R(\psi)$, as $N \to \infty$. It is obvious that all truncated stopping rules are truncatable. In particular, Theorem 4 holds when $\mathcal{F}$ is the set of all truncatable stopping rules.

The following lemma gives a necessary and sufficient condition for truncatability of a stopping rule.

Lemma 3. A stopping rule $\psi$ with $R(\psi) < \infty$ is truncatable if and only if

$$\int t_N^\psi l_N \, d\mu^N \to 0, \quad \text{as } N \to \infty. \tag{41}$$

If $R(\psi) = \infty$, then $R_N(\psi) \to \infty$, $N \to \infty$.
Proof. Let $\psi$ be such that $R(\psi) < \infty$.
Suppose that (41) is fulfilled. Then, by (29) and (36),

$$R(\psi) - R_N(\psi) = \sum_{n=N}^{\infty} \int s_n^\psi \left( c n f^n + l_n \right) d\mu^n - c \int t_N^\psi N f^N \, d\mu^N - \int t_N^\psi l_N \, d\mu^N. \tag{42}$$

The first summand converges to zero, as $N \to \infty$, being the tail of a convergent series (this is because $R(\psi) < \infty$).

The third summand in (42) goes to 0 as $N \to \infty$, because of (41).

The integral in the second summand in (42) is equal to

$$N P^{\pi_2}(\tau_\psi \ge N) \le E^{\pi_2} \tau_\psi I_{\{\tau_\psi \ge N\}} \to 0,$$

as $N \to \infty$, because $E^{\pi_2} \tau_\psi < \infty$ (this is due to $R(\psi) < \infty$ again).

It follows from (42) now that $R_N(\psi) \to R(\psi) < \infty$ as $N \to \infty$.

Let us suppose now that $R_N(\psi) \to R(\psi) < \infty$ as $N \to \infty$. For the same reasons as above, the first two summands on the right-hand side of (42) tend to 0 as $N \to \infty$, therefore so does the third, i.e. (41) follows. The first assertion of Lemma 3 is proved.

If $R(\psi) = \infty$, this may be because $P^{\pi_2}(\tau_\psi < \infty) < 1$, or, if not, because

$$\sum_{n=1}^{\infty} \int s_n^\psi \left( c n f^n + l_n \right) d\mu^n = \infty.$$

In the latter case, obviously,

$$R_N(\psi) \ge \sum_{n=1}^{N-1} \int s_n^\psi \left( c n f^n + l_n \right) d\mu^n \to \infty,$$

as $N \to \infty$.

In the former case,

$$R_N(\psi) \ge c \int t_N^\psi N f^N \, d\mu^N = c N P^{\pi_2}(\tau_\psi \ge N) \to \infty,$$

as $N \to \infty$, as well. □

Let us say that the problem (of minimization of $R(\psi)$) is truncatable if all stopping rules $\psi$ are truncatable.

Corollary 1 below gives some practical sufficient conditions for truncatability of a problem.

Corollary 1. The problem of minimization of $R(\psi)$ is truncatable if
i) the loss function $w$ is bounded, and

$$R(\psi) < \infty \text{ implies that } P^{\pi_1}(\tau_\psi < \infty) = 1, \tag{43}$$

or
ii)

$$\int l_N \, d\mu^N \to 0, \tag{44}$$

as $N \to \infty$.
Proof. If $w(\theta, d) < M < \infty$ for any $\theta$ and $d$, then, by the definition of $l_N$,

$$\int t_N^\psi l_N \, d\mu^N \le M \int t_N^\psi \left[ \int f_\theta^N \, d\pi_1(\theta) \right] d\mu^N = M P^{\pi_1}(\tau_\psi \ge N). \tag{45}$$

If now $R(\psi) < \infty$, then by (43) the right-hand side of (45) tends to 0, as $N \to \infty$, i.e. (41) is fulfilled for any $\psi$ such that $R(\psi) < \infty$. Thus, by Lemma 3, any $\psi$ is truncatable.

If (44) is fulfilled, then (41) is satisfied for any $\psi$. Again, by Lemma 3, any $\psi$ is truncatable. □

Remark 5. Condition i) of Corollary 1 is fulfilled for any Bayesian hypothesis testing problem (i.e. when $\pi_1 = \pi_2 = \pi$) with bounded loss function (see, for example, [17] and [15]). Indeed, in this case $R(\psi) < \infty$ implies $E^\pi \tau_\psi < \infty$, so, in particular, $P^\pi(\tau_\psi < \infty) = 1$.

Remark 6. It is easy to see that Condition ii) of Corollary 1 is equivalent to

$$\int E_\theta w(\theta, \delta_N^B) \, d\pi_1(\theta) \to 0, \quad N \to \infty,$$

i.e. that the Bayesian risk, with respect to the prior distribution $\pi_1$, of an optimal procedure based on a sample of a fixed size $N$, vanishes as $N \to \infty$. This is a very typical behavior of statistical risks.
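
To see Condition ii) at work in a concrete case, the following sketch evaluates $\int l_N \, d\mu^N$ for a toy model that is assumed here for illustration only: i.i.d. Bernoulli observations, two simple hypotheses with 0-1 loss, and a hypothetical parameter `prior1` standing for $\pi_1(\{\theta_1\})$. Samples with the same number of ones share the same value of $l_N$, so the integral reduces to a finite sum.

```python
from math import comb


def fixed_sample_bayes_risk(N, theta1, theta2, prior1):
    """Integral of l_N over all samples of size N (Bernoulli data, 0-1 loss)."""
    total = 0.0
    for k in range(N + 1):
        # prior-weighted densities of a sample with k ones under each hypothesis
        under_h1 = prior1 * theta1**k * (1 - theta1)**(N - k)
        under_h2 = (1 - prior1) * theta2**k * (1 - theta2)**(N - k)
        # comb(N, k) samples share this value of l_N
        total += comb(N, k) * min(under_h1, under_h2)
    return total


# Bayes risk of the best fixed-sample procedure for growing sample sizes
risks = [fixed_sample_bayes_risk(N, 0.3, 0.7, 0.5) for N in (1, 5, 25, 100)]
```

In this toy model the values in `risks` decrease towards zero as $N$ grows, which is the behavior required by (44).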

The following theorem is an immediate consequence of Theorem 4.

Theorem 5. Let the problem of minimization of $R(\psi)$ be truncatable, and let $\mathcal{F}$ be the set of all stopping rules. Then

$$R(\psi) = \inf_{\psi' \in \mathcal{F}} R(\psi') \tag{46}$$

if and only if

$$I_{\{l_n < Q_n\}} \le \psi_n \le I_{\{l_n \le Q_n\}} \tag{47}$$

$\mu^n$-almost everywhere on $T_n^\psi$ for all $n = 1, 2, \dots$.

Remark 7. Once again (see Remark 2), the optimal stopping rule $\psi$ from Theorem 5 (and Theorem 4) only makes practical sense if $l_0 \ge Q_0 = \inf_{\psi \in \mathcal{F}} R(\psi)$, because otherwise the trivial rule, which does not take any observation, performs better than $\psi$, from the point of view of minimization of $R(\psi)$.

Remark 8. Combining Theorems 1, 2 and 5, we immediately have the following solution to the conditional problem posed in the Introduction.

Let $\lambda_1, \dots, \lambda_k$ be arbitrary positive constants. Let $\delta_n^B$, $n = 1, 2, \dots$, be Bayesian, with respect to $\pi_1$, decision rules for the "loss function"

$$w'(\theta, d) = \sum_{i=1}^{k} \lambda_i w(\theta, d) I_{\Theta_i}(\theta),$$

i.e. such that for all $d \in \mathcal{D}$

$$\sum_{i=1}^{k} \lambda_i \int_{\Theta_i} w(\theta, d) f_\theta^n \, d\pi_1(\theta) \ge l_n = \sum_{i=1}^{k} \lambda_i \int_{\Theta_i} w(\theta, \delta_n^B) f_\theta^n \, d\pi_1(\theta) \tag{48}$$

$\mu^n$-almost everywhere (remember that $\delta_n^B = \delta_n^B(x_1, \dots, x_n)$ and $f_\theta^n = f_\theta^n(x_1, \dots, x_n)$).

For any $N \ge 1$ define $V_N^N = l_N$, and $V_n^N = \min\{l_n, Q_n^N\}$ for $n = N - 1, N - 2, \dots, 1$, where $Q_n^N = f^n + \int V_{n+1}^N \, d\mu(x_{n+1})$, with $f^n = \int f_\theta^n \, d\pi_2(\theta)$.

Let also $V_n = \lim_{N \to \infty} V_n^N$ and $Q_n = \lim_{N \to \infty} Q_n^N$, $n = 1, 2, \dots$.

Suppose, finally, that the problem is truncatable (see Corollary 1 for sufficient conditions for that).

Let $\psi$ be any stopping rule satisfying

$$I_{\{l_n < Q_n\}} \le \psi_n \le I_{\{l_n \le Q_n\}} \tag{49}$$

$\mu^n$-almost everywhere on $T_n^\psi$ for all $n = 1, 2, \dots$.

Then for any sequential decision procedure $(\psi', \delta)$ such that

$$W_i(\psi', \delta) \le W_i(\psi, \delta^B), \quad i = 1, \dots, k, \tag{50}$$

it holds

$$N(\psi) \le N(\psi'). \tag{51}$$

The inequality in (51) is strict if at least one of the inequalities in (50) is strict. If there are equalities in all of the inequalities in (50) and (51), then

$$I_{\{l_n < Q_n\}} \le \psi_n' \le I_{\{l_n \le Q_n\}} \tag{52}$$

$\mu^n$-almost everywhere on $T_n^{\psi'}$ for all $n = 1, 2, \dots$, and

$$\sum_{i=1}^{k} \lambda_i \int_{\Theta_i} w(\theta, \delta_n) f_\theta^n \, d\pi_1(\theta) = \sum_{i=1}^{k} \lambda_i \int_{\Theta_i} w(\theta, \delta_n^B) f_\theta^n \, d\pi_1(\theta)$$

$\mu^n$-almost everywhere on $S_n^{\psi'}$ for all $n = 1, 2, \dots$.

For Bayesian problems (when $\pi_1 = \pi_2 = \pi$) Theorem 5 can be reformulated in the following equivalent way.
Let

$$R_n = \frac{l_n}{f^n} = \frac{\int_\Theta f_\theta^n w(\theta, \delta_n^B) \, d\pi(\theta)}{\int_\Theta f_\theta^n \, d\pi(\theta)}$$

be the posterior risk (see, e.g., [1]). Let $v_N^N = R_N(X_1, \dots, X_N)$, and recursively for $n = N - 1, N - 2, \dots, 1$

$$v_n^N(X_1, \dots, X_n) = \min\{R_n(X_1, \dots, X_n), q_n^N(X_1, \dots, X_n)\},$$

where

$$q_n^N(X_1, \dots, X_n) = c + E^\pi\{v_{n+1}^N \mid X_1, \dots, X_n\}$$

($E^\pi$ stands for the expectation with respect to the family of finite-dimensional densities $f^n = \int_\Theta f_\theta^n \, d\pi(\theta)$, $n = 1, 2, \dots$, meaning, in particular, that

$$E^\pi\{v_{n+1}^N \mid x_1, \dots, x_n\} = \int \frac{v_{n+1}^N(x_1, \dots, x_{n+1}) f^{n+1}(x_1, \dots, x_{n+1})}{f^n(x_1, \dots, x_n)} \, d\mu(x_{n+1})).$$

Let, finally, $v_n = v_n(X_1, \dots, X_n) = \lim_{N \to \infty} v_n^N(X_1, \dots, X_n)$, and

$$q_n = q_n(X_1, \dots, X_n) = \lim_{N \to \infty} q_n^N(X_1, \dots, X_n), \quad n = 1, 2, \dots.$$

Then, the following reformulation of Theorem 5 gives, for a truncatable Bayesian 
problem, the structure of all Bayesian randomized tests (cf. Theorem 7, Ch. 7, in 

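Theorem 6. Let the problem of minimization of $R(\psi)$ be truncatable, and let $\mathcal{F}$ be the set of all stopping rules. Then

$$R(\psi) = \inf_{\psi' \in \mathcal{F}} R(\psi') \tag{53}$$

if and only if

$$I_{\{R_n < q_n\}} \le \psi_n \le I_{\{R_n \le q_n\}} \tag{54}$$

$P^\pi$-almost surely on $T_n^\psi$ for all $n = 1, 2, \dots$.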

Remark 9. More general variants of Theorem 6, for cases when the loss function due to incorrect decision is of the form $w(\theta, d) = w_n(\theta, d; x_1, \dots, x_n)$ and/or the cost of the observations $(x_1, \dots, x_n)$ is of type $K_\theta^n(x_1, \dots, x_n)$, can easily be deduced from Theorem 4 in [16]. In particular, this gives the structure of optimal sequential multiple hypotheses tests for the problem considered in Section 9.4 of [22].

Remark 10. Theorem 6, in particular, gives a solution to optimal sequential hypothesis testing problems considered in [6] and [5] (where the general theory of 
optimal stopping is used, see [H or [Hj). See [T^ and [TS] for a more detailed 
description of the respective Bayesian sequential procedures. 